feat: Display JSON columns in anywidget mode #2138
In b/448126500 I suggest investigating the errors that happen when visualizing STRUCT columns, but this PR doesn't test such cases.
43a938c to 4a33ccf

30dfa7d to 237c134

    # anywidget mode uses the same display logic as the "deferred" mode
    # for faster execution
    if opts.repr_mode in ("deferred", "anywidget"):
Let's revert this.
f18fd9e to 8601e52

533541a to 860db3e

This commit migrates the `round_op` operator from the Ibis compiler to the SQLGlot compiler.

* add error handling for audio_transcribe
* add error handling for pdf functions
* add error handling for image functions
* final touch
* restore rename
* update notebook to better reflect our new code change
* return None on error with verbose=False for image functions
* define typing module in udf
* only use local variable
* Refactor code

* feat: support INFORMATION_SCHEMA tables in read_gbq
* avoid storage semi executor
* use faster tables for peek tests
* more tests
* fix mypy
* Update bigframes/session/_io/bigquery/read_gbq_table.py
* immediately query for information_schema tables
* Fix mypy errors and temporarily update python version
* snapshot
* snapshot again

This reverts commit db5d8ea.

When displaying a DataFrame containing JSON columns (including nested JSON in lists or structs), the `anywidget` table would fail to render and fall back to the "Computation deferred" message.

This was caused by a limitation in PyArrow (apache/arrow#45262), which raises an `ArrowNotImplementedError` when attempting to create an empty Arrow array from an extension type like `db_dtypes.JSONArrowType`. The `TableWidget` initialization triggers this error when creating an empty DataFrame to build the table structure before fetching data.

This commit introduces a workaround in `bigframes.core.blocks.to_pandas_batches`. Before creating the empty DataFrame for the widget, the code now:

- Replaces `JSONArrowType` in the schema with `pyarrow.string` to create a "safe" dtype.
- Creates an empty `pandas.Series` using this safe dtype.

This approach avoids the PyArrow error while preserving the correct schema for the DataFrame.

Additionally, the existing conversion of JSON data to strings in `bigframes.session.executor.py` is retained to handle data correctly during processing after it is fetched from BigQuery.

Fixes #<448126500 and 453561268> 🦕